Choosing the content of textual summaries of large time-series data sets
Abstract
Natural Language Generation (NLG) can be used to generate textual summaries of numeric data sets. In this paper we develop an architecture for generating short (a few sentences) summaries of large (100KB or more) time-series data sets. The architecture integrates pattern recognition, pattern abstraction, selection of the most significant patterns, microplanning (especially aggregation), and realisation. We also describe and evaluate SumTime-Turbine, a prototype system which uses this architecture to generate textual summaries of sensor data from gas turbines.
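The stages named in the abstract can be illustrated with a minimal sketch. This is not the SumTime-Turbine implementation: the deviation-from-median detector, the `Pattern` record, the top-k selection rule, the grouping key, and the sentence template are all assumptions chosen only to show how recognition, abstraction, selection, aggregation, and realisation might fit together.

```python
from dataclasses import dataclass
from statistics import median


@dataclass
class Pattern:
    """Hypothetical abstraction of one detected pattern (not from the paper)."""
    channel: str
    start: int    # index of the first sample in the pattern
    end: int      # index of the last sample in the pattern
    kind: str     # "spike" or "dip"
    size: float   # largest absolute deviation from the channel median


def recognise(channel, values, threshold):
    """Recognition + abstraction: each run of consecutive samples deviating
    from the channel median by more than `threshold` becomes one Pattern."""
    base = median(values)
    patterns = []
    current = None
    for i, v in enumerate(values):
        dev = v - base
        kind = "spike" if dev > threshold else "dip" if dev < -threshold else None
        if current is not None and current.kind == kind:
            # Same kind as the open run: extend it.
            current.end = i
            current.size = max(current.size, abs(dev))
            continue
        current = None
        if kind is not None:
            current = Pattern(channel, i, i, kind, abs(dev))
            patterns.append(current)
    return patterns


def select(patterns, k):
    """Selection: keep the k patterns with the largest deviation (assumed rule)."""
    return sorted(patterns, key=lambda p: p.size, reverse=True)[:k]


def aggregate(patterns):
    """Microplanning (aggregation): group patterns of the same kind that
    start at the same sample so they can share one sentence."""
    groups = {}
    for p in sorted(patterns, key=lambda p: p.start):
        groups.setdefault((p.kind, p.start), []).append(p)
    return list(groups.values())


def realise(group):
    """Realisation: fill a fixed sentence template for one group."""
    subject = " and ".join(sorted(p.channel for p in group))
    return f"At sample {group[0].start}, {subject} showed a {group[0].kind}."


def summarise(data, threshold, k):
    """Run the whole pipeline over a dict of channel name -> list of samples."""
    patterns = [p for name, vals in data.items()
                for p in recognise(name, vals, threshold)]
    return " ".join(realise(g) for g in aggregate(select(patterns, k)))


if __name__ == "__main__":
    # Made-up sample data, for illustration only.
    data = {
        "temp": [10, 10, 10, 30, 31, 10, 10],
        "pressure": [5, 5, 5, 9, 5, 5, 5],
    }
    print(summarise(data, threshold=2, k=2))
```

Keeping each stage as a separate function mirrors the pipeline structure the abstract describes, so any one stage (for example the selection rule) could be swapped out without touching the others.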
Similar resources
A Hybrid Time Series Clustering Method Based on Fuzzy C-Means Algorithm: An Agreement Based Clustering Approach
In recent years, the advancement of information gathering technologies such as GPS and GSM networks has led to huge complex datasets such as time series and trajectories. As a result it is essential to use appropriate methods to analyze the produced large raw datasets. Extracting useful information from large data sets has always been one of the most important challenges in different sciences,...
SumTime-Turbine: A Knowledge-Based System to Communicate Gas Turbine Time-Series Data
SumTime-Turbine produces textual summaries of archived time-series data from gas turbines. These summaries should help experts understand large data sets that cannot be visually presented in a single graphical display. SumTime-Turbine is based on pattern detection, knowledge-based temporal abstraction (KBTA), and natural language generation (NLG) technology. A prototype version of the system has...
Summarizing Neonatal Time Series Data
We describe our investigations in generating textual summaries of physiological time series data to aid medical personnel in monitoring babies in neonatal intensive care units. Our studies suggest that summarization is a communicative task that requires data analysis techniques for determining the content of the summary. We describe a prototype system that summarizes physiological time series.
SUMTIME: Observations from KA for Weather Domain
SUMTIME (http://www.csd.abdn.ac.in/research/sumtime) is a research project aimed at developing a generic computational model for producing textual summaries of time series data. This report summarises some of the observations made during the initial knowledge acquisition sessions carried out in the weather forecasting domain. Based on these observations, we describe a two-stage model for conten...
IGR For GR/M76881/01: Generating Summaries of Time-Series Data (SumTime) Background/Context
Background/Context The goal of the SumTime project was to develop better techniques for generating English summaries of numerical time-series data. The modern world is being flooded with such data. For example, a typical gas-turbine has 250 sensors, each sampling once per minute. This produces 200MB of data per day, which a maintenance engineer may have one hour (per day) to attempt to understa...
Journal title:
- Natural Language Engineering
Volume 13
Publication date: 2007